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The author of this paper presents three arguments (philosophical, empirical, and 
linguistic) to make his point that the computer, far from being worthless with words, 
offers the brightest hope for the future management of the verbal processes so 
important in counseling and guidance. Philosophically, he argues, there is no deep 
support for bias against the machine, since, in any auidance situation, exact 
measurement must be taken by whatever means available. Computers can respond if 
there is insistence upon behavioral data rather than data concerning internal states, 
and operational definitions instead of idealistic ones. Empirically, the computer has 
proven itself valuable in many statistical demonstrations done by groups working 
independently of one another. The central linguistic problem appears to be in the 
area of transformational grammar or the relating of one statement to some 
transformed equivalent. Much work is currently being done in the area of approaches 
to meanings in the field of computational linguistics. Since counselors serve as 
information processors, and are presumably operating under "as-yet 

dimly-understood rules," the author feels that they can begin to make some practical 
use of the computer in language analysis. (Author/CJ) 
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In guidance, as in the rest of education, words are the 
coin of the realm. Our ethics and traditions in guidance normally 
restrict us to non-physiological procedures , and words seem to 
be almost all that we have left: We identify problems because 

of the words that students use. We analyze conditions through 
other words they use. We counsel in words, and we evaluate the 
effectiveness of our counsel through the consideration of more 
words. But we don’t do any of this very much, or for many of 
our students, because there are too many of them, and we are 
too busy, and time is too short. This is all part of our tradi- 
tional professional guilt. 

True, the computer is beginning to help us in a number of 
important ways : It scores tests and questionnaires , and can 

make good analyses and predictions. Used properly, it can help 
us greatly with quantitative measures , can discharge class 
scheduling, grade reports, warning notices, absence accounting, 
and other routines. But for verbal interaction with students, 
the computer is apt to seem pretty far out . We cannot believe 
that it can help us much in a counseling interview , or can in 
any other way with our load of words . When it comes to conversa- 
tion, we imagine that computers must be worse than useless. 

The theme of this paper is that the computer, far from be- 
ing worthless with words , offers us the brightest hope for future 
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management: of the verbal processes so important: in guidance. 

To present this theme , I shall present three main kinds of 
argument: first, philosophical: second, empirical $ third, 

linguistic . 

I . Philosophical Arguments 

The philosophical considerations will not take very much 
time here. The main arguments against a computer being used 
in conversation are of two sorts: (1) the alleged inability 

of the computer to handle conversation, and (2) the alleged 
immorality of the computer handling conversation. 

Let us first dispose, if we can, of the immorality argument 
One writer has been very critical of our own work because he 
believes that , in his words , reliability and dependability and 
objectivity are not appropriate goals in the evaluation of stu- 
dent themes. Anyone trained in measurement must insist that 
such traits are very important, whatever techniques are employed 
but it should not be worth space in a professional journal to 
argue this point. 

A second argument to "immorality" is the claim that the 
computer cannot be swayed properly by human emotions and softnes 
of judgment. The word "mechanical" is in this view associated 
with the notions of "coldness" and "unkindness." On close in- 
spection , this argument is seen as a variant of the argument 
to the inability of the computer, and should be so considered. 

Concerning the alleged inability of the computer, the best 
discussion of the question of mental power is probably still the 
1950 classic by the late Alan Turing, 11 ’ which should be read by 
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everyone interested in the question. The principal argument 
here centers , by fairly general agreement , around what Turing 
called “The Imitation Game, 1 ' in which a judge is charged with the 
responsibility for telling whether answers to questions are 
generated by a man or by a machine . The usual consensus is that , 
if a machine can successfully imitate the man in this game , the 
machine can be said to “think 51 — and consequently , be responsible 
for other duties ordinarily considered to require thought . 

The basis of the imitation game should be familiar to any- 
one in behavioral science: It is the foundation of scientific 

psychology, the insistence upon data concerning behavior, rather 
than internal states, and the insistence upon operational defini- 
tions, rather than idealistic definitions. The game is thus 
thoroughly consistent with our own philosophical roots, which 
depend on defined behaviors , rather than on inferred conscious 
attitudes. Given such a commitment, then there is nothing in 
principle illogical about machines thinking, or conversing, or for 
that matter, about machines feeling emotions and expressing per- 
sonal opinion about a counselee’s processes — at least, to the 
degree we are willing to attribute these activities to human 
guidance personnel other than ourselves. 

The fact is that, given suitable programs, the computer is 
today successfully playing Turing T s game all the time — though 
of course not yet with the exhaustive realism which would in 
all circumstances fool the judge. But consider a few of the 
machined current successes in education: certain kinds of 

official letter writing, the scheduling of classes, the reporting 
of test performance, and so on. The extrapolation required to 
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imagine successful playing of the imitation game is much less 
today, than it was in 1950. 

The most comical recent example of the imitation game , suc- 
cessfully played, was reported in a newsletter of the artificial 
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intelligence group of the Association for Computing Machinery. 

You may know of the MAD DOCTOR program, written by Joseph Weizenbaum 
of MIT, which does converse with a client, and which assumes part 
of the role of a nondirective counselor. If a client at the 
terminal types, ”1 am worried about so-and-so , " ‘ the computer will 
respond, "How long have you been worried about so-and-so? 11 In 
other words, the MAD DOCTOR is in the professional school (or the 
alleged school) of Carl Rogers. In Cambridge, a computer scientist 
at Bolt, Beranek and Newman had been recently working with the 
DOCTOR program, when a company vice president came to the machine. 
Dan Bobrow, whose terminal it was, tells what happened: 



The vice president found a note on the PDP-1 teletype 
that read: 11 TYPE ALL MESSAGES INTO THIS TELETYPE, AND END 
THEM WITH A PERIOD. 11 This note was left there by one of 
the programming bums who had just succeeded in making run 
a LISP version of the DOCTOR program (based on ELIZA as 
developed by Weisenbaum) . However , the VP thought that 
this note indicated that this teletype was linked with 
the one in my home, since I often used the computer all 
night. Therefore, he started typing to "me” but in 
fact was typing to this version of the DOCTOR program. 

The following conversation ensued: 



VP: DO YOU THINK THAT I CAN USE THE COMPUTER 

THIS MORNING? 

Teletype Response: WHY DO YOU ASK? 




VP: 



I MIGHT BE ABLE TO MAKE SOME ADDITIONAL 
TELCOMP SALES. 



Teletype 


Response : 


WHY AREN’T YOU SURE? 


VP: 




MY PERSPECTIVE CUSTOMERS HAVE NEVER SEEN A 
DEMONSTRATION OF THE SYSTEM. 


Teletype 


Response : 


DOES THIS MEAN VERY MUCH TO YOU? 


VP: 




OF COURSE IT DOES. 


Teletype 


Response : 


YOU SEEM VERY POSITIVE ABOUT THAT. 


VP: 




THIS IS RIDICULOUS. 


Teletype 


Response : 


WHAT DO YOU MEAN THIS IS RIDICULOUS? 


VP: 




IS THERE ANY REASON WHY I CAN’T USE THE 
COMPUTER NOW? 


Teletype 


Response : 


TELL ME MORE ABOUT YOUR PERSPECTIVE CUSTOMERS 


VP: 




PLEASE DIAL ME UP ON 491-1850 



Note that after the last remark the VP did not obey 
instructions, and left out the period. Therefore, of 
course, the computer didn’t answer him. This so 
infuriated the VP, who thought I was playing games with 
him, that he called me up, woke me from a deep sleep, 
and said: 



VP: Why are you being so snotty to me? 

Bobrow: What do you mean why am I being snotty to you 



The VP angrily read me the dialog that "we” had 
been having, and couldn’t get any response but laughter 
from me. It took a while to convince him it really was 
the computer. 



So perhaps the Turing Test is losing remoteness faster than 
we think! 

A more general answer to the objection concerning inability 
may be found in the early writings of computer scientists, and 
in a good number of writings of very recent date . This answer 
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is based upon mathematical and logical foundations * and concerns 
what has been called the "universal machine." This is surely 
not the place to recapitulate the reasoning, but there are a 

D 

number of excellent and extensive discussions of it available. 
Briefly, it may be demonstrated that a very elementary machine, 
much simpler than our current computers, may perform tasks of 
any conceivable complexity. The sole exception to this open range 
of possibility is a certain sort of self-examination, which we 
cannot be sure the human can perform, either . 

Philosophically, there is really no deep support for bias 
against the machine, or for thinking it cannot handle the 
language necessary to proper guidance of students. 

II . Empirical Arguments 

It is fine to talk of theoretical capacity, but what of real 
demonstration of language capability appropriate to the guidance 
responsibilities? Here we shall refer briefly to a few of the 
studies which have demonstrated high promise in related fields. 

Besides the Weizenbaum work with the HAD DOCTOR and similar 
Dro grams , there are a number of question— answering systems which 
exhibit interesting features. Some of these are described in 
Computers and Thought and elsewhere. David Tiedeman, Allan Ellis, 
and the other Cambridge men involved with Information Systems 
for Vocational Decisions, have related such work directly to the 
central problems of guidance : knowledge of the student , knowledge 

of the vocational world, and the interface between them. 

Such systems are related to the field of information 
retrieval also. And the extensive work done in retrieval, in such 
systems as Gerard Salton’s SMART, constitutes a major source of 
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sophistication for eventual guidance packages. Some results are 
statistically very impressive. ^ 

Statistical analysis forms the center of certain major lines 
of work. Philip Stone and others have conducted many content- 
analytic researches in the behavioral sciences using the "In- 
quirer” system. And others have now demonstrated the effective- 

g 

ness of computers and statistics in analyzing the humanities. 

Statistical techniques are also at the heart of our own work 
in analyzing student essays. Beginning in 1965, we have tried 
to simulate the performance of a group of expert judges, evalua- 
ting student writing on a number of important variables. The 
first work was in English composition, and we showed that a com- 
puter, with a simple set of criteria but with a fairly advanced 
optimization strategy, could rate essays on traits such as 
content, organization , style , mechanics , and even creativity as 
well as could the usual experienced teacher of English. That is, 

the computer ratings resembled the expert group ratings , as much 

g 

as did the individual member ratings from that group. 

More recently, workers at Connecticut have turned to the 

evaluation of subject-matter content in what the student was 

writing. For this they have turned to essay examinations in 

various disciplines, originally in courses in Western Civilization 

in large university classes. Some results were reported by John 

McManus and others at the recent meeting of the American Educa- 

10 

tional Research Association. 

The approach used may illustrate one sort of statistical 
technique and is consistent with other strategies of the project: 
Final examinations were collected for a large number of students 



of history. The portion dealing with short items was of particu- 
lar interest, where students were asked to "identify 11 such terms 
as "manorial ism" and "Thomas Acquinas." Their answers were 
punched into cards , sorted by item and randomly reordered anew 
for each of eight independent judges, graduate students in history 
chosen by the department chairman. They were supplied by the 
professor of the course with acceptable "key" answers for the 
items . 

The performance of these judges was quite erratic, as 
measured by intercorrelations — surprisingly, because one might 
presuppose that such short answers would be obviously "either 
right or wrong." For the four items the median inter judge corre- 
lation was about .45. There were only five "grades" which the 
judges could award, and for each item examined, there were some 
responses which indeed received all five grades . The typical 
number of grades was three, with more responses receiving four 
grades than receiving only two. That humanist who believes in 
subjectivity should be happy with the human performance on this 
one ! 

The statistical approach was to find those words which 
optimally separated the high-rated responses from the low-rated 
responses, to find those terms which occur in the best responses, 
but not in the others. These were then tested (cross-validated) 
against other responses not in the generating sample. Using this 
and certain other simple statistical variables , the computer was 
able (1) to determine which answers were "relevant" to the topic 
of the question; and (2) to determine which of the relevant re- 
sponses deserved which grade. The performance was believed by 



McManus to be slightly better than that of the usual human judge 
(although this comparison is a rather tricky one to make). 

The main message of this beginning work is clear: The 

statistical approach, using a variety of techniques can reasonably 
evaluate the substantive content of student work, and not solely 
the essay traits formerly considered. 

Still another statistical demonstration of computer effec- 
tiveness in a "soft" area has been made at Connecticut. The 
Torrance Tests of Creative Thinking (TTCT) is, of course, an 
attempt to establish a standardized measurement of "creativity,” 
based upon a manual evaluation of student verbal responses to 
certain stimuli. Paulus and Renzulli simulated part of the TTCT, 
using statistical techniques rather similar to those of McManus. 
They produced a cross-validity correlation with criterion of .69 
(which would become .75 corrected for unreliability of the 
criterion). This first effort, so far applied to only one sub- 
test of the TTCT, was not yet up to human standards, but could 
become much cheaper and faster than the manual procedure , and 
has much room for improvement. 

This portion of the paper has only pointed to a few of the 
empirical results which argue strongly for the eventual effec- 
tiveness of computer language analysis in the guidance process . 

Ill . Linguistic Arguments 

The biggest limitations of the statistical methods described 
are not really in the field of statistics. Rather, they are in 
the raw material provided to the statistical programs , in the 
descriptions of the student language which the computer is able 



to generate. With better descriptions, we may expect increasingly 



effective statistical strategies, for statistics is a very 
effective and versatile computer tool. 

When we speak of descriptions of language, we must depend 
on our knowledge of language structure, hence on linguistics. 
However, we do not need to restrict ourselves to the techniques 
and theories of conventional linguists, although their work may 
be an important starting point for our research. 

When a computer is given the task of such description, 
the words will usually be looked up in dictionaries in the 
computer, and the strings will be recoded in some higher-level 
description. The entry words, those actually used by the student ^ 
are often called "terminal symbols," because they are the end 
(or at times the beginning) of the supposed generative process. 

The higher-level translations of these terminals may take on an 
almost limitless variety of forms. Some will be close to the 
ordinary "parts of speech," so that the terminal hit may be re- 
coded as N for noun, or VT for transitive verb, and so on. 
Obviously, many terminals may take on a number of such grammatical 
descriptions, just as hit may be two different parts of speech, 
and may have a number of definitions within one of those parts 
(a hit may occur in baseball or in theater) . 

Parsing systems have been implemented for computer analysis 
of language , and some of these have been tried out , in a tenta- 
tive way, on examples of student writing. In most current ver- 
sions, the program tries out each different grammatical designa- 
tion for each word, and tracks out each grammatical tree generated 
by the rules of the grammar stored in the computer. These systems 
can produce parsings which are extremely detailed and rich, and 



usually better than would ever be executed by 99% of the users 
of English. Unfortunately , however, they produce ordinarily 
far more parsings than are wanted, and automatic techniques for 
telling which are valid are not well developed. 

One of the best current systems is that of Professor Susumu 
Kuno of Harvard , who kindly parsed for us 50 sentences from high 
school essays. These sentences were independently judged to be 
"grammatical t; or "not grammatical,' 1 and we were interested to 
see whether the Kuno-Oettinger program was useful in predicting 
the human evaluation. From the crossbreaks table, the discovered 
contingency coefficient was .48, supporting strongly the rela- 
tionship between human and machine judgment. However, for each 
of the "grammatical" sentences, the program produced over 23 
parsings on the average, and the best we could hope for is that 
just one of these would be the "correct" parsing. And how do we 
tell the difference? 

Even given a perfect description of the syntax of a sentence, 
however, still harder problems lie ahead. For how are we to re- 
cognize the relationship of one parsed sentence to another? For 
instance, consider two sentences: (1) Columbus discovered 

America; and (2) The Western Hemisphere was first explored by 
Columbus. We would probably consider them acceptable alternatives 
to a single question. But how is the machine to learn this 
equivalence? 

In a way, this assault on "meaning," more direct than the 
word approaches described earlier in this paper, is the central 
problem in linguistics . Thus this work shares with linguistics 
a concern for what is called "transformational grammar" — 
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techniques for relating one statement to some transformed 
equivalent (as in active-passive verb changes). But it also 
has a great concern for problems of' dictionary ('America" and 
"Western Hemisphere"), and of background knowledge, and of logic. 

These problems are connected with much exciting material 
being treated currently in the fields of computational linguistics , 
and of artificial intelligence. Some of the most intriguing 
approaches to meaning can be read about in a new book entitled 
Semantic Information Processing. 12 In this Minsky has drawn to- 
gether a number of studies, mostly by young and recent Ph.D.s. 

A few of the feats can be briefly recounted: The system by 

Raphael performs some "understanding" on a limited subset of 
English, and makes intelligent responses to questions. A sample 
dialogue follows : 



Human : 
Teletype : 
Human : 
Teletype : 
Human : 
Teletype : 
Human : 
Teletype : 
Human : 
Teletype : 

Human : 
Teletype : 



EVERY BOY J % * A PERSON 
I UNDERSTAND 

A FINGER IS PART OF A HAND 
I UNDERSTAND 

EACH PERSON HAS TWO HANDS 

THE ABOVE SENTENCE IS AMBIGUOUS. PLEASE RE-PHRASE IT. 
THERE ARE TWO HANDS ON EACH PERSON 
I UNDERSTAND 

HOW MANY FINGERS DOES JOHN HAVE? 

THE ABOVE SENTENCE IS AMBIGUOUS. BUT I ASSUME (HAS) 
MEANS (HAS AS PARTS). I DON'T KNOW WHETHER FINGER 
IB PART OF JOHN. 

EVERY HAND HAS 5 FINGERS. 

. . . I ASSUME (HAS) MEANS (HAS AS PARTS). I UNDER- 
STAND. 
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Human: HOW MANY FINGERS DOES JOHN HAVE? 

Teletype: . . . THE ANSWER IS 10. 

And so on. In this dialogue with Raphael’s program, the human 
is clearly teaching the computer about the world, or at least 
about certain kinds of symbolic relationships representing the 
world . 

In the same book, Thomas Evans demonstrates a program which 
identifies visual analogies involving geometric figures, and 
passes an intelligence test on them at about the 15-year-old 
level. And Ross Quillian generates an impressive associational 

net for word meanings. And John McCarthy describes what he calls 

* 

* 

a "program with common sense.” 

All N -pf these "semantic” programs share a common concern 
with representing reality in symbolic strings, and then with per- 
forming information retrieval , and logical inference , about the 
data in these strings. Space is too limited to describe these 
efforts in any detail, but the alert guidance worker must see 
parallels with the thought processes of his profession. 

From all these approaches, logical, empirical, linguistic, 
there seems to be a clear message for us: We are after all, in 

our roles as counselor, guidance expert, or student, serving as 
information processors. We are presumably operating under rules, 
although we have still only a dim understanding of the rules, and 
we shall never understand the rules completely , any more than we 
shall understand completely any other part of the world around us . 
Yet we can begin to make some practical use of the computer in 
language analysis, still using only a portion of the present 
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knowledge, let alone what we shall know in the future about 
language and its use. Surely, guidance practice in the future 
must benefit from such analysis. Words are indeed our game, and 
the computer can help us play it. 
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